CSAR: Cluster Storage with Adaptive Redundancy

Authors

  • Manoj Pillai
  • Mario Lauria
Abstract

Striped file systems such as the Parallel Virtual File System (PVFS) deliver high-bandwidth I/O to applications running on clusters. An open problem in existing striped file systems is how to provide efficient data redundancy that reduces their vulnerability to disk failures. In this paper we describe CSAR, a version of PVFS augmented with a novel redundancy scheme that addresses the efficiency issue while using unmodified stock file systems. By dynamically switching between RAID1 and RAID5 redundancy based on write size, CSAR achieves RAID1 performance on small writes and RAID5 efficiency on large writes. On a microbenchmark over 7 I/O nodes, our scheme matches the read bandwidth of PVFS and achieves 73% of its write bandwidth. We describe the issues in implementing our new scheme in a popular striped file system such as PVFS on a Linux cluster with a high-performance I/O subsystem.
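
The switch between RAID1 and RAID5 by write size can be illustrated with a minimal sketch. This is not the CSAR implementation: the abstract does not state the exact switching rule, so the sketch assumes that a write covering a full stripe gets RAID5 parity and anything smaller is mirrored. The names `STRIPE_UNIT`, `DATA_NODES`, `xor_parity`, and `redundancy_for_write` are hypothetical, and writes are assumed to start on a stripe boundary.

```python
from functools import reduce

STRIPE_UNIT = 4   # bytes stored per I/O node in one stripe (tiny, for illustration)
DATA_NODES = 3    # hypothetical number of data blocks per stripe
FULL_STRIPE = STRIPE_UNIT * DATA_NODES


def xor_parity(blocks):
    """Byte-wise XOR of equal-length blocks, as used for RAID5 parity."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def redundancy_for_write(data: bytes):
    """Return one (scheme, redundant_bytes) pair per stripe-sized chunk.

    Assumed policy (not taken from the paper): full-stripe chunks get
    RAID5 parity, partial chunks are mirrored RAID1-style.
    """
    plan = []
    for start in range(0, len(data), FULL_STRIPE):
        chunk = data[start:start + FULL_STRIPE]
        if len(chunk) == FULL_STRIPE:
            # Large write: one parity block protects the whole stripe.
            blocks = [chunk[i:i + STRIPE_UNIT]
                      for i in range(0, FULL_STRIPE, STRIPE_UNIT)]
            plan.append(("raid5", xor_parity(blocks)))
        else:
            # Small write: mirror the data, avoiding a parity read-modify-write.
            plan.append(("raid1", bytes(chunk)))
    return plan
```

Under these assumptions, a 12-byte write produces 4 bytes of parity, while a 3-byte write produces a 3-byte mirror copy. That is the trade-off the abstract describes: mirroring keeps small writes cheap, and parity keeps the storage overhead of large writes low.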

Similar resources

CSAR-2: a Case Study of Parallel File System Dependability

Modern cluster file systems such as PVFS that stripe files across multiple nodes have been shown to provide high aggregate I/O bandwidth but are prone to data loss, since the failure of a single disk or server affects the whole file system. To address this problem, a number of distributed data redundancy schemes have been proposed that represent different trade-offs between performance, storage effici...

CSAR-2: A Case Study of Parallel File System Dependability Analysis

Modern cluster file systems such as PVFS that stripe files across multiple nodes have been shown to provide high aggregate I/O bandwidth but are prone to data loss, since the failure of a single disk or server affects the whole file system. To address this problem, a number of distributed data redundancy schemes have been proposed that represent different trade-offs between performance, storage effici...

A High Performance Redundancy Scheme for Cluster File Systems

A known problem in the design of striped file systems is their vulnerability to disk failures. In this paper we address the challenges of augmenting an existing file system with traditional RAID redundancy, and we propose a novel hybrid redundancy scheme designed to maximize disk throughput as seen by the applications. To demonstrate the hybrid redundancy scheme we build CSAR, a proof-of-concep...

A Dynamic Deduplication Approach for Big Data Storage

As data grows every day, managing storage devices for this explosive growth of digital data is a very challenging task. Data reduction has become a crucial problem. Deduplication plays a vital role in removing redundancy in large-scale cluster computing storage. As a result, deduplication provides better storage utilization by eliminating redundant copies of data and savi...

Visualization of Time-Dependent Adaptive Mesh Refinement Data

Analysis of phenomena that simultaneously occur on quite different spatial and temporal scales requires adaptive, hierarchical schemes to reduce computational and storage demands. For data represented as grid functions, the key is adaptive, hierarchical, time-dependent grids that resolve spatio-temporal details without too much redundancy. Here, so-called AMR grids are gaining popularity. F...

Publication date: 2003